Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 4898 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 772 |
| Duplicate rows (%) | 15.8% |
| Total size in memory | 459.3 KiB |
| Average record size in memory | 96.0 B |
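The report does not say how "duplicate rows" is counted, and two common definitions give different numbers. As a minimal sketch on made-up rows (the data and the helper name `duplicate_counts` are illustrative assumptions, not part of the report), the following computes both the number of surplus copies and the number of distinct rows that occur more than once. The last line only checks the percentage arithmetic from the table above.

```python
from collections import Counter

def duplicate_counts(rows):
    """Return (surplus_copies, distinct_duplicated_rows) for a list of row tuples."""
    counts = Counter(rows)
    # Copies beyond the first occurrence of each row.
    surplus = sum(c - 1 for c in counts.values())
    # Distinct row patterns that appear at least twice.
    groups = sum(1 for c in counts.values() if c > 1)
    return surplus, groups

# Hypothetical toy rows, not taken from the wine data.
rows = [(7.0, 0.27), (6.3, 0.30), (7.0, 0.27), (7.0, 0.27), (6.3, 0.30), (8.1, 0.28)]
surplus, groups = duplicate_counts(rows)

# Percentage as reported: 772 duplicate rows out of 4898 observations.
duplicate_pct = round(100 * 772 / 4898, 1)
```

On the toy rows the two definitions already disagree, so it is worth confirming which one a tool uses before comparing counts across tools.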
Variable types
| Numeric | 12 |
|---|---|
Dataset has 772 (15.8%) duplicate rows | Duplicates |
residual sugar is highly correlated with density | High correlation |
chlorides is highly correlated with density and 1 other fields | High correlation |
free sulfur dioxide is highly correlated with total sulfur dioxide | High correlation |
total sulfur dioxide is highly correlated with free sulfur dioxide and 1 other fields | High correlation |
density is highly correlated with residual sugar and 3 other fields | High correlation |
alcohol is highly correlated with chlorides and 1 other fields | High correlation |
density is highly correlated with residual sugar and 2 other fields | High correlation |
alcohol is highly correlated with density | High correlation |
density is highly correlated with residual sugar and 1 other fields | High correlation |
total sulfur dioxide is highly correlated with free sulfur dioxide | High correlation |
Reproduction
| Analysis started | 2022-02-27 19:22:32.622666 |
|---|---|
| Analysis finished | 2022-02-27 19:22:47.221213 |
| Duration | 14.6 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
fixed acidity
Real number (ℝ≥0)
| Distinct | 68 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.854787668 |
| Minimum | 3.8 |
|---|---|
| Maximum | 14.2 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 3.8 |
|---|---|
| 5-th percentile | 5.6 |
| Q1 | 6.3 |
| median | 6.8 |
| Q3 | 7.3 |
| 95-th percentile | 8.3 |
| Maximum | 14.2 |
| Range | 10.4 |
| Interquartile range (IQR) | 1 |
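The Range and IQR rows are derived from the other quantile rows. A small sketch, using only the values reported for fixed acidity in the table above, shows the two subtractions:

```python
import math

# Quantile values copied from the fixed acidity table.
minimum, q1, q3, maximum = 3.8, 6.3, 7.3, 14.2

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1

# Floating-point subtraction is inexact, so compare with a tolerance.
range_ok = math.isclose(value_range, 10.4)
iqr_ok = math.isclose(iqr, 1.0)
```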
Descriptive statistics
| Standard deviation | 0.8438682277 |
|---|---|
| Coefficient of variation (CV) | 0.1231063993 |
| Kurtosis | 2.172178465 |
| Mean | 6.854787668 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 0.6477514746 |
| Sum | 33574.75 |
| Variance | 0.7121135857 |
| Monotonicity | Not monotonic |
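Several of the descriptive statistics are functions of one another. The sketch below recomputes the coefficient of variation, the variance and the sum from the reported mean, standard deviation and number of observations. It is a consistency check on the published figures, not a description of how the profiling library computes them internally.

```python
import math

n = 4898                 # Number of observations (dataset statistics)
mean = 6.854787668       # Mean of fixed acidity
std = 0.8438682277       # Standard deviation of fixed acidity

cv = std / mean          # Coefficient of variation = std / mean
variance = std ** 2      # Variance = std squared
total = mean * n         # Sum = mean * number of observations

# Compare against the reported values with a small relative tolerance.
cv_ok = math.isclose(cv, 0.1231063993, rel_tol=1e-6)
variance_ok = math.isclose(variance, 0.7121135857, rel_tol=1e-6)
total_ok = math.isclose(total, 33574.75, rel_tol=1e-6)
```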
| Value | Count | Frequency (%) |
| 6.8 | 308 | 6.3% |
| 6.6 | 290 | 5.9% |
| 6.4 | 280 | 5.7% |
| 6.9 | 241 | 4.9% |
| 6.7 | 236 | 4.8% |
| 7 | 232 | 4.7% |
| 6.5 | 225 | 4.6% |
| 7.2 | 206 | 4.2% |
| 7.1 | 200 | 4.1% |
| 7.4 | 194 | 4.0% |
| Other values (58) | 2486 | 50.8% |
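Each Frequency (%) cell is the count divided by the 4898 observations. The sketch below reproduces the displayed labels. The "< 0.1%" rule is an assumption inferred from the rows shown in this report (a count of 2 is displayed as "< 0.1%", a count of 3 as "0.1%"), not taken from the library source, and `frequency_label` is a hypothetical helper name.

```python
def frequency_label(count, total):
    """Format count/total as a percentage label with one decimal place."""
    pct = 100 * count / total
    # Assumption: non-zero values that would round to 0.0% are shown as "< 0.1%".
    if 0 < pct < 0.05:
        return "< 0.1%"
    return f"{pct:.1f}%"

n_observations = 4898
most_common = frequency_label(308, n_observations)   # value 6.8 in the table above
rare_value = frequency_label(2, n_observations)
other_values = frequency_label(2486, n_observations)
```

The same function can be used to recover a frequency cell from its count wherever the percentage is not shown.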
| Value | Count | Frequency (%) |
| 3.8 | 1 | < 0.1% |
| 3.9 | 1 | < 0.1% |
| 4.2 | 2 | < 0.1% |
| 4.4 | 3 | 0.1% |
| 4.5 | 1 | < 0.1% |
| 4.6 | 1 | < 0.1% |
| 4.7 | 5 | 0.1% |
| 4.8 | 9 | 0.2% |
| 4.9 | 7 | 0.1% |
| 5 | 24 | 0.5% |
| Value | Count | Frequency (%) |
| 14.2 | 1 | < 0.1% |
| 11.8 | 1 | < 0.1% |
| 10.7 | 2 | < 0.1% |
| 10.3 | 2 | < 0.1% |
| 10.2 | 1 | < 0.1% |
| 10 | 3 | 0.1% |
| 9.9 | 2 | < 0.1% |
| 9.8 | 8 | 0.2% |
| 9.7 | 4 | 0.1% |
| 9.6 | 5 | 0.1% |
volatile acidity
Real number (ℝ≥0)
| Distinct | 125 |
|---|---|
| Distinct (%) | 2.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2782411188 |
| Minimum | 0.08 |
|---|---|
| Maximum | 1.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 0.08 |
|---|---|
| 5-th percentile | 0.15 |
| Q1 | 0.21 |
| median | 0.26 |
| Q3 | 0.32 |
| 95-th percentile | 0.46 |
| Maximum | 1.1 |
| Range | 1.02 |
| Interquartile range (IQR) | 0.11 |
Descriptive statistics
| Standard deviation | 0.1007945484 |
|---|---|
| Coefficient of variation (CV) | 0.3622561211 |
| Kurtosis | 5.091625817 |
| Mean | 0.2782411188 |
| Median Absolute Deviation (MAD) | 0.06 |
| Skewness | 1.576979503 |
| Sum | 1362.825 |
| Variance | 0.01015954099 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.28 | 263 | 5.4% |
| 0.24 | 253 | 5.2% |
| 0.26 | 240 | 4.9% |
| 0.25 | 231 | 4.7% |
| 0.22 | 229 | 4.7% |
| 0.27 | 218 | 4.5% |
| 0.23 | 216 | 4.4% |
| 0.2 | 214 | 4.4% |
| 0.3 | 198 | 4.0% |
| 0.21 | 191 | 3.9% |
| Other values (115) | 2645 | 54.0% |
| Value | Count | Frequency (%) |
| 0.08 | 4 | 0.1% |
| 0.085 | 1 | < 0.1% |
| 0.09 | 1 | < 0.1% |
| 0.1 | 6 | 0.1% |
| 0.105 | 6 | 0.1% |
| 0.11 | 13 | 0.3% |
| 0.115 | 3 | 0.1% |
| 0.12 | 34 | 0.7% |
| 0.125 | 3 | 0.1% |
| 0.13 | 44 | 0.9% |
| Value | Count | Frequency (%) |
| 1.1 | 1 | < 0.1% |
| 1.005 | 1 | < 0.1% |
| 0.965 | 1 | < 0.1% |
| 0.93 | 1 | < 0.1% |
| 0.91 | 1 | < 0.1% |
| 0.905 | 1 | < 0.1% |
| 0.85 | 1 | < 0.1% |
| 0.815 | 1 | < 0.1% |
| 0.785 | 1 | < 0.1% |
| 0.78 | 1 | < 0.1% |
citric acid
Real number (ℝ≥0)
| Distinct | 87 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.3341915067 |
| Minimum | 0 |
|---|---|
| Maximum | 1.66 |
| Zeros | 19 |
| Zeros (%) | 0.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.17 |
| Q1 | 0.27 |
| median | 0.32 |
| Q3 | 0.39 |
| 95-th percentile | 0.54 |
| Maximum | 1.66 |
| Range | 1.66 |
| Interquartile range (IQR) | 0.12 |
Descriptive statistics
| Standard deviation | 0.1210198042 |
|---|---|
| Coefficient of variation (CV) | 0.362127109 |
| Kurtosis | 6.174900657 |
| Mean | 0.3341915067 |
| Median Absolute Deviation (MAD) | 0.06 |
| Skewness | 1.281920398 |
| Sum | 1636.87 |
| Variance | 0.01464579301 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.3 | 307 | 6.3% |
| 0.28 | 282 | 5.8% |
| 0.32 | 257 | 5.2% |
| 0.34 | 225 | 4.6% |
| 0.29 | 223 | 4.6% |
| 0.26 | 219 | 4.5% |
| 0.27 | 216 | 4.4% |
| 0.49 | 215 | 4.4% |
| 0.31 | 200 | 4.1% |
| 0.33 | 183 | 3.7% |
| Other values (77) | 2571 | 52.5% |
| Value | Count | Frequency (%) |
| 0 | 19 | 0.4% |
| 0.01 | 7 | 0.1% |
| 0.02 | 6 | 0.1% |
| 0.03 | 2 | < 0.1% |
| 0.04 | 12 | 0.2% |
| 0.05 | 5 | 0.1% |
| 0.06 | 6 | 0.1% |
| 0.07 | 12 | 0.2% |
| 0.08 | 4 | 0.1% |
| 0.09 | 12 | 0.2% |
| Value | Count | Frequency (%) |
| 1.66 | 1 | < 0.1% |
| 1.23 | 1 | < 0.1% |
| 1 | 5 | 0.1% |
| 0.99 | 1 | < 0.1% |
| 0.91 | 2 | < 0.1% |
| 0.88 | 1 | < 0.1% |
| 0.86 | 1 | < 0.1% |
| 0.82 | 2 | < 0.1% |
| 0.81 | 2 | < 0.1% |
| 0.8 | 2 | < 0.1% |
residual sugar
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 310 |
|---|---|
| Distinct (%) | 6.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.391414863 |
| Minimum | 0.6 |
|---|---|
| Maximum | 65.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 0.6 |
|---|---|
| 5-th percentile | 1.1 |
| Q1 | 1.7 |
| median | 5.2 |
| Q3 | 9.9 |
| 95-th percentile | 15.7 |
| Maximum | 65.8 |
| Range | 65.2 |
| Interquartile range (IQR) | 8.2 |
Descriptive statistics
| Standard deviation | 5.072057784 |
|---|---|
| Coefficient of variation (CV) | 0.7935735502 |
| Kurtosis | 3.469820103 |
| Mean | 6.391414863 |
| Median Absolute Deviation (MAD) | 3.6 |
| Skewness | 1.077093756 |
| Sum | 31305.15 |
| Variance | 25.72577016 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1.2 | 187 | 3.8% |
| 1.4 | 184 | 3.8% |
| 1.6 | 165 | 3.4% |
| 1.3 | 147 | 3.0% |
| 1.1 | 146 | 3.0% |
| 1.5 | 142 | 2.9% |
| 1.7 | 99 | 2.0% |
| 1.8 | 99 | 2.0% |
| 1 | 93 | 1.9% |
| 2 | 79 | 1.6% |
| Other values (300) | 3557 | 72.6% |
| Value | Count | Frequency (%) |
| 0.6 | 2 | < 0.1% |
| 0.7 | 7 | 0.1% |
| 0.8 | 25 | 0.5% |
| 0.9 | 39 | 0.8% |
| 0.95 | 4 | 0.1% |
| 1 | 93 | 1.9% |
| 1.05 | 1 | < 0.1% |
| 1.1 | 146 | 3.0% |
| 1.15 | 3 | 0.1% |
| 1.2 | 187 | 3.8% |
| Value | Count | Frequency (%) |
| 65.8 | 1 | < 0.1% |
| 31.6 | 2 | < 0.1% |
| 26.05 | 2 | < 0.1% |
| 23.5 | 1 | < 0.1% |
| 22.6 | 1 | < 0.1% |
| 22 | 2 | < 0.1% |
| 20.8 | 2 | < 0.1% |
| 20.7 | 2 | < 0.1% |
| 20.4 | 1 | < 0.1% |
| 20.3 | 1 | < 0.1% |
chlorides
Real number (ℝ≥0)
| Distinct | 160 |
|---|---|
| Distinct (%) | 3.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.04577235606 |
| Minimum | 0.009 |
|---|---|
| Maximum | 0.346 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 0.009 |
|---|---|
| 5-th percentile | 0.027 |
| Q1 | 0.036 |
| median | 0.043 |
| Q3 | 0.05 |
| 95-th percentile | 0.067 |
| Maximum | 0.346 |
| Range | 0.337 |
| Interquartile range (IQR) | 0.014 |
Descriptive statistics
| Standard deviation | 0.02184796809 |
|---|---|
| Coefficient of variation (CV) | 0.4773179703 |
| Kurtosis | 37.56459971 |
| Mean | 0.04577235606 |
| Median Absolute Deviation (MAD) | 0.007 |
| Skewness | 5.023330683 |
| Sum | 224.193 |
| Variance | 0.0004773337098 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.044 | 201 | 4.1% |
| 0.036 | 200 | 4.1% |
| 0.042 | 184 | 3.8% |
| 0.04 | 182 | 3.7% |
| 0.046 | 181 | 3.7% |
| 0.048 | 174 | 3.6% |
| 0.047 | 171 | 3.5% |
| 0.045 | 170 | 3.5% |
| 0.05 | 170 | 3.5% |
| 0.034 | 168 | 3.4% |
| Other values (150) | 3097 | 63.2% |
| Value | Count | Frequency (%) |
| 0.009 | 1 | < 0.1% |
| 0.012 | 1 | < 0.1% |
| 0.013 | 1 | < 0.1% |
| 0.014 | 4 | 0.1% |
| 0.015 | 4 | 0.1% |
| 0.016 | 5 | 0.1% |
| 0.017 | 5 | 0.1% |
| 0.018 | 10 | 0.2% |
| 0.019 | 9 | 0.2% |
| 0.02 | 16 | 0.3% |
| Value | Count | Frequency (%) |
| 0.346 | 1 | < 0.1% |
| 0.301 | 1 | < 0.1% |
| 0.29 | 1 | < 0.1% |
| 0.271 | 1 | < 0.1% |
| 0.255 | 1 | < 0.1% |
| 0.244 | 1 | < 0.1% |
| 0.24 | 1 | < 0.1% |
| 0.239 | 1 | < 0.1% |
| 0.217 | 1 | < 0.1% |
| 0.212 | 1 | < 0.1% |
free sulfur dioxide
Real number (ℝ≥0)
| Distinct | 132 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.30808493 |
| Minimum | 2 |
|---|---|
| Maximum | 289 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 11 |
| Q1 | 23 |
| median | 34 |
| Q3 | 46 |
| 95-th percentile | 63 |
| Maximum | 289 |
| Range | 287 |
| Interquartile range (IQR) | 23 |
Descriptive statistics
| Standard deviation | 17.00713733 |
|---|---|
| Coefficient of variation (CV) | 0.4816782716 |
| Kurtosis | 11.46634243 |
| Mean | 35.30808493 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 1.406744921 |
| Sum | 172939 |
| Variance | 289.24272 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 29 | 160 | 3.3% |
| 31 | 132 | 2.7% |
| 26 | 129 | 2.6% |
| 35 | 129 | 2.6% |
| 34 | 128 | 2.6% |
| 36 | 127 | 2.6% |
| 24 | 118 | 2.4% |
| 28 | 112 | 2.3% |
| 33 | 112 | 2.3% |
| 25 | 111 | 2.3% |
| Other values (122) | 3640 | 74.3% |
| Value | Count | Frequency (%) |
| 2 | 1 | < 0.1% |
| 3 | 10 | 0.2% |
| 4 | 11 | 0.2% |
| 5 | 25 | 0.5% |
| 6 | 32 | 0.7% |
| 7 | 25 | 0.5% |
| 8 | 35 | 0.7% |
| 9 | 29 | 0.6% |
| 10 | 55 | 1.1% |
| 11 | 45 | 0.9% |
| Value | Count | Frequency (%) |
| 289 | 1 | < 0.1% |
| 146.5 | 1 | < 0.1% |
| 138.5 | 1 | < 0.1% |
| 131 | 1 | < 0.1% |
| 128 | 1 | < 0.1% |
| 124 | 1 | < 0.1% |
| 122.5 | 1 | < 0.1% |
| 118.5 | 1 | < 0.1% |
| 112 | 1 | < 0.1% |
| 110 | 1 | < 0.1% |
total sulfur dioxide
Real number (ℝ≥0)
| Distinct | 251 |
|---|---|
| Distinct (%) | 5.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 138.3606574 |
| Minimum | 9 |
|---|---|
| Maximum | 440 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 9 |
|---|---|
| 5-th percentile | 75 |
| Q1 | 108 |
| median | 134 |
| Q3 | 167 |
| 95-th percentile | 212 |
| Maximum | 440 |
| Range | 431 |
| Interquartile range (IQR) | 59 |
Descriptive statistics
| Standard deviation | 42.49806455 |
|---|---|
| Coefficient of variation (CV) | 0.3071542543 |
| Kurtosis | 0.5718532334 |
| Mean | 138.3606574 |
| Median Absolute Deviation (MAD) | 29 |
| Skewness | 0.3907098417 |
| Sum | 677690.5 |
| Variance | 1806.085491 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 111 | 69 | 1.4% |
| 113 | 61 | 1.2% |
| 117 | 57 | 1.2% |
| 118 | 55 | 1.1% |
| 128 | 54 | 1.1% |
| 114 | 54 | 1.1% |
| 150 | 54 | 1.1% |
| 122 | 54 | 1.1% |
| 124 | 53 | 1.1% |
| 140 | 52 | 1.1% |
| Other values (241) | 4335 | 88.5% |
| Value | Count | Frequency (%) |
| 9 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 18 | 2 | < 0.1% |
| 19 | 1 | < 0.1% |
| 21 | 1 | < 0.1% |
| 24 | 3 | 0.1% |
| 25 | 1 | < 0.1% |
| 26 | 1 | < 0.1% |
| 28 | 4 | 0.1% |
| 29 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 440 | 1 | < 0.1% |
| 366.5 | 1 | < 0.1% |
| 344 | 1 | < 0.1% |
| 313 | 1 | < 0.1% |
| 307.5 | 1 | < 0.1% |
| 303 | 1 | < 0.1% |
| 294 | 1 | < 0.1% |
| 282 | 1 | < 0.1% |
| 272 | 2 | < 0.1% |
| 260 | 1 | < 0.1% |
density
Real number (ℝ≥0)
| Distinct | 890 |
|---|---|
| Distinct (%) | 18.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.9940273765 |
| Minimum | 0.98711 |
|---|---|
| Maximum | 1.03898 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 0.98711 |
|---|---|
| 5-th percentile | 0.9896385 |
| Q1 | 0.9917225 |
| median | 0.99374 |
| Q3 | 0.9961 |
| 95-th percentile | 0.999 |
| Maximum | 1.03898 |
| Range | 0.05187 |
| Interquartile range (IQR) | 0.0043775 |
Descriptive statistics
| Standard deviation | 0.002990906917 |
|---|---|
| Coefficient of variation (CV) | 0.003008877811 |
| Kurtosis | 9.793806911 |
| Mean | 0.9940273765 |
| Median Absolute Deviation (MAD) | 0.00214 |
| Skewness | 0.9777730049 |
| Sum | 4868.74609 |
| Variance | 8.945524186 × 10⁻⁶ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.992 | 64 | 1.3% |
| 0.9928 | 61 | 1.2% |
| 0.9932 | 53 | 1.1% |
| 0.993 | 52 | 1.1% |
| 0.9934 | 50 | 1.0% |
| 0.9938 | 49 | 1.0% |
| 0.9927 | 47 | 1.0% |
| 0.9944 | 46 | 0.9% |
| 0.9948 | 45 | 0.9% |
| 0.9954 | 44 | 0.9% |
| Other values (880) | 4387 | 89.6% |
| Value | Count | Frequency (%) |
| 0.98711 | 1 | < 0.1% |
| 0.98713 | 1 | < 0.1% |
| 0.98722 | 1 | < 0.1% |
| 0.9874 | 1 | < 0.1% |
| 0.98742 | 2 | < 0.1% |
| 0.98746 | 2 | < 0.1% |
| 0.98758 | 1 | < 0.1% |
| 0.98774 | 1 | < 0.1% |
| 0.98779 | 1 | < 0.1% |
| 0.98794 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 1.03898 | 1 | < 0.1% |
| 1.0103 | 2 | < 0.1% |
| 1.00295 | 2 | < 0.1% |
| 1.00241 | 1 | < 0.1% |
| 1.0024 | 1 | < 0.1% |
| 1.00196 | 1 | < 0.1% |
| 1.00182 | 1 | < 0.1% |
| 1.0017 | 2 | < 0.1% |
| 1.0012 | 1 | < 0.1% |
| 1.00118 | 1 | < 0.1% |
pH
Real number (ℝ≥0)
| Distinct | 103 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.188266639 |
| Minimum | 2.72 |
|---|---|
| Maximum | 3.82 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 2.72 |
|---|---|
| 5-th percentile | 2.96 |
| Q1 | 3.09 |
| median | 3.18 |
| Q3 | 3.28 |
| 95-th percentile | 3.46 |
| Maximum | 3.82 |
| Range | 1.1 |
| Interquartile range (IQR) | 0.19 |
Descriptive statistics
| Standard deviation | 0.1510005996 |
|---|---|
| Coefficient of variation (CV) | 0.04736134605 |
| Kurtosis | 0.5307749515 |
| Mean | 3.188266639 |
| Median Absolute Deviation (MAD) | 0.1 |
| Skewness | 0.4577825459 |
| Sum | 15616.13 |
| Variance | 0.02280118108 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3.14 | 172 | 3.5% |
| 3.16 | 164 | 3.3% |
| 3.22 | 146 | 3.0% |
| 3.19 | 145 | 3.0% |
| 3.18 | 138 | 2.8% |
| 3.2 | 137 | 2.8% |
| 3.15 | 136 | 2.8% |
| 3.08 | 136 | 2.8% |
| 3.1 | 135 | 2.8% |
| 3.12 | 134 | 2.7% |
| Other values (93) | 3455 | 70.5% |
| Value | Count | Frequency (%) |
| 2.72 | 1 | < 0.1% |
| 2.74 | 1 | < 0.1% |
| 2.77 | 1 | < 0.1% |
| 2.79 | 3 | 0.1% |
| 2.8 | 3 | 0.1% |
| 2.82 | 1 | < 0.1% |
| 2.83 | 4 | 0.1% |
| 2.84 | 1 | < 0.1% |
| 2.85 | 9 | 0.2% |
| 2.86 | 9 | 0.2% |
| Value | Count | Frequency (%) |
| 3.82 | 1 | < 0.1% |
| 3.81 | 1 | < 0.1% |
| 3.8 | 2 | < 0.1% |
| 3.79 | 1 | < 0.1% |
| 3.77 | 2 | < 0.1% |
| 3.76 | 2 | < 0.1% |
| 3.75 | 2 | < 0.1% |
| 3.74 | 2 | < 0.1% |
| 3.72 | 3 | 0.1% |
| 3.7 | 1 | < 0.1% |
sulphates
Real number (ℝ≥0)
| Distinct | 79 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4898468763 |
| Minimum | 0.22 |
|---|---|
| Maximum | 1.08 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 0.22 |
|---|---|
| 5-th percentile | 0.34 |
| Q1 | 0.41 |
| median | 0.47 |
| Q3 | 0.55 |
| 95-th percentile | 0.71 |
| Maximum | 1.08 |
| Range | 0.86 |
| Interquartile range (IQR) | 0.14 |
Descriptive statistics
| Standard deviation | 0.1141258339 |
|---|---|
| Coefficient of variation (CV) | 0.2329826717 |
| Kurtosis | 1.59092963 |
| Mean | 0.4898468763 |
| Median Absolute Deviation (MAD) | 0.07 |
| Skewness | 0.9771936833 |
| Sum | 2399.27 |
| Variance | 0.01302470597 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.5 | 249 | 5.1% |
| 0.46 | 225 | 4.6% |
| 0.44 | 216 | 4.4% |
| 0.38 | 214 | 4.4% |
| 0.42 | 181 | 3.7% |
| 0.48 | 179 | 3.7% |
| 0.45 | 178 | 3.6% |
| 0.47 | 172 | 3.5% |
| 0.4 | 168 | 3.4% |
| 0.54 | 167 | 3.4% |
| Other values (69) | 2949 | 60.2% |
| Value | Count | Frequency (%) |
| 0.22 | 1 | < 0.1% |
| 0.23 | 1 | < 0.1% |
| 0.25 | 4 | 0.1% |
| 0.26 | 4 | 0.1% |
| 0.27 | 13 | 0.3% |
| 0.28 | 13 | 0.3% |
| 0.29 | 16 | 0.3% |
| 0.3 | 31 | 0.6% |
| 0.31 | 35 | 0.7% |
| 0.32 | 54 | 1.1% |
| Value | Count | Frequency (%) |
| 1.08 | 1 | < 0.1% |
| 1.06 | 1 | < 0.1% |
| 1.01 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 0.99 | 1 | < 0.1% |
| 0.98 | 6 | 0.1% |
| 0.97 | 1 | < 0.1% |
| 0.96 | 3 | 0.1% |
| 0.95 | 5 | 0.1% |
| 0.94 | 2 | < 0.1% |
alcohol
Real number (ℝ≥0)
| Distinct | 103 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.51426705 |
| Minimum | 8 |
|---|---|
| Maximum | 14.2 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 8 |
|---|---|
| 5-th percentile | 8.9 |
| Q1 | 9.5 |
| median | 10.4 |
| Q3 | 11.4 |
| 95-th percentile | 12.7 |
| Maximum | 14.2 |
| Range | 6.2 |
| Interquartile range (IQR) | 1.9 |
Descriptive statistics
| Standard deviation | 1.230620568 |
|---|---|
| Coefficient of variation (CV) | 0.1170429248 |
| Kurtosis | -0.6984253278 |
| Mean | 10.51426705 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.4873419932 |
| Sum | 51498.88 |
| Variance | 1.514426982 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 9.4 | 229 | 4.7% |
| 9.5 | 228 | 4.7% |
| 9.2 | 199 | 4.1% |
| 9 | 185 | 3.8% |
| 10 | 162 | 3.3% |
| 10.5 | 160 | 3.3% |
| 11 | 158 | 3.2% |
| 10.4 | 153 | 3.1% |
| 9.1 | 144 | 2.9% |
| 9.8 | 136 | 2.8% |
| Other values (93) | 3144 | 64.2% |
| Value | Count | Frequency (%) |
| 8 | 2 | < 0.1% |
| 8.4 | 3 | 0.1% |
| 8.5 | 9 | 0.2% |
| 8.6 | 23 | 0.5% |
| 8.7 | 78 | 1.6% |
| 8.8 | 107 | 2.2% |
| 8.9 | 95 | 1.9% |
| 9 | 185 | 3.8% |
| 9.1 | 144 | 2.9% |
| 9.2 | 199 | 4.1% |
| Value | Count | Frequency (%) |
| 14.2 | 1 | < 0.1% |
| 14.05 | 1 | < 0.1% |
| 14 | 5 | 0.1% |
| 13.9 | 3 | 0.1% |
| 13.8 | 2 | < 0.1% |
| 13.7 | 7 | 0.1% |
| 13.6 | 9 | 0.2% |
| 13.55 | 1 | < 0.1% |
| 13.5 | 12 | 0.2% |
| 13.4 | 20 | 0.4% |
quality
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.877909351 |
| Minimum | 3 |
|---|---|
| Maximum | 9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 38.4 KiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 5 |
| median | 6 |
| Q3 | 6 |
| 95-th percentile | 7 |
| Maximum | 9 |
| Range | 6 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.885638575 |
|---|---|
| Coefficient of variation (CV) | 0.1506723772 |
| Kurtosis | 0.2165258272 |
| Mean | 5.877909351 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.1557963977 |
| Sum | 28790 |
| Variance | 0.7843556855 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6 | 2198 | 44.9% |
| 5 | 1457 | 29.7% |
| 7 | 880 | 18.0% |
| 8 | 175 | 3.6% |
| 4 | 163 | 3.3% |
| 3 | 20 | 0.4% |
| 9 | 5 | 0.1% |
| Value | Count | Frequency (%) |
| 3 | 20 | 0.4% |
| 4 | 163 | 3.3% |
| 5 | 1457 | 29.7% |
| 6 | 2198 | 44.9% |
| 7 | 880 | 18.0% |
| 8 | 175 | 3.6% |
| 9 | 5 | 0.1% |
| Value | Count | Frequency (%) |
| 9 | 5 | 0.1% |
| 8 | 175 | 3.6% |
| 7 | 880 | 18.0% |
| 6 | 2198 | 44.9% |
| 5 | 1457 | 29.7% |
| 4 | 163 | 3.3% |
| 3 | 20 | 0.4% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
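The calculation described above can be sketched in plain Python: rank both variables (tied values share the mean of their positions, which is one common convention and an assumption here), then apply the Pearson formula to the ranks. The function names and the toy data are illustrative and not taken from the report.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman(x, y)
r = pearson(x, y)
```

On this toy data ρ reaches its maximum because the relationship is perfectly monotonic, while r stays below 1 because the relationship is not linear.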
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
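As a minimal sketch of that formula and of the invariance property, the following computes r on toy data and again after shifting and rescaling each variable separately. The data and the transformation constants are arbitrary illustrations. The scale factors are positive, since a negative factor would flip the sign of r.

```python
from statistics import mean

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / (sx * sy)

# Toy data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson(x, y)
# Change location and scale of each variable separately (positive scale factors).
r_transformed = pearson([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
```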
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
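The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition on toy data. Many libraries implement a tie-corrected variant (tau-b) instead, and the report does not state which variant it uses, so treat this as an illustration of the text rather than of the report's computation.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1      # both variables order the pair the same way
        elif s < 0:
            discordant += 1      # the variables order the pair in opposite ways
        # s == 0 means a tie in x or y; it counts toward neither group.
    return (concordant - discordant) / len(pairs)

# Toy data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
```

With four observations there are six pairs, five concordant and one discordant here, so τ is (5 - 1) / 6.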
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available separately.
First rows
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7.0 | 0.27 | 0.36 | 20.7 | 0.045 | 45.0 | 170.0 | 1.0010 | 3.00 | 0.45 | 8.8 | 6 |
| 1 | 6.3 | 0.30 | 0.34 | 1.6 | 0.049 | 14.0 | 132.0 | 0.9940 | 3.30 | 0.49 | 9.5 | 6 |
| 2 | 8.1 | 0.28 | 0.40 | 6.9 | 0.050 | 30.0 | 97.0 | 0.9951 | 3.26 | 0.44 | 10.1 | 6 |
| 3 | 7.2 | 0.23 | 0.32 | 8.5 | 0.058 | 47.0 | 186.0 | 0.9956 | 3.19 | 0.40 | 9.9 | 6 |
| 4 | 7.2 | 0.23 | 0.32 | 8.5 | 0.058 | 47.0 | 186.0 | 0.9956 | 3.19 | 0.40 | 9.9 | 6 |
| 5 | 8.1 | 0.28 | 0.40 | 6.9 | 0.050 | 30.0 | 97.0 | 0.9951 | 3.26 | 0.44 | 10.1 | 6 |
| 6 | 6.2 | 0.32 | 0.16 | 7.0 | 0.045 | 30.0 | 136.0 | 0.9949 | 3.18 | 0.47 | 9.6 | 6 |
| 7 | 7.0 | 0.27 | 0.36 | 20.7 | 0.045 | 45.0 | 170.0 | 1.0010 | 3.00 | 0.45 | 8.8 | 6 |
| 8 | 6.3 | 0.30 | 0.34 | 1.6 | 0.049 | 14.0 | 132.0 | 0.9940 | 3.30 | 0.49 | 9.5 | 6 |
| 9 | 8.1 | 0.22 | 0.43 | 1.5 | 0.044 | 28.0 | 129.0 | 0.9938 | 3.22 | 0.45 | 11.0 | 6 |
Last rows
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4888 | 6.8 | 0.220 | 0.36 | 1.20 | 0.052 | 38.0 | 127.0 | 0.99330 | 3.04 | 0.54 | 9.2 | 5 |
| 4889 | 4.9 | 0.235 | 0.27 | 11.75 | 0.030 | 34.0 | 118.0 | 0.99540 | 3.07 | 0.50 | 9.4 | 6 |
| 4890 | 6.1 | 0.340 | 0.29 | 2.20 | 0.036 | 25.0 | 100.0 | 0.98938 | 3.06 | 0.44 | 11.8 | 6 |
| 4891 | 5.7 | 0.210 | 0.32 | 0.90 | 0.038 | 38.0 | 121.0 | 0.99074 | 3.24 | 0.46 | 10.6 | 6 |
| 4892 | 6.5 | 0.230 | 0.38 | 1.30 | 0.032 | 29.0 | 112.0 | 0.99298 | 3.29 | 0.54 | 9.7 | 5 |
| 4893 | 6.2 | 0.210 | 0.29 | 1.60 | 0.039 | 24.0 | 92.0 | 0.99114 | 3.27 | 0.50 | 11.2 | 6 |
| 4894 | 6.6 | 0.320 | 0.36 | 8.00 | 0.047 | 57.0 | 168.0 | 0.99490 | 3.15 | 0.46 | 9.6 | 5 |
| 4895 | 6.5 | 0.240 | 0.19 | 1.20 | 0.041 | 30.0 | 111.0 | 0.99254 | 2.99 | 0.46 | 9.4 | 6 |
| 4896 | 5.5 | 0.290 | 0.30 | 1.10 | 0.022 | 20.0 | 110.0 | 0.98869 | 3.34 | 0.38 | 12.8 | 7 |
| 4897 | 6.0 | 0.210 | 0.38 | 0.80 | 0.020 | 22.0 | 98.0 | 0.98941 | 3.26 | 0.32 | 11.8 | 6 |
Most frequently occurring
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 423 | 7.0 | 0.15 | 0.28 | 14.7 | 0.051 | 29.0 | 149.0 | 0.99792 | 2.96 | 0.39 | 9.0 | 7 | 8 |
| 557 | 7.3 | 0.19 | 0.27 | 13.9 | 0.057 | 45.0 | 155.0 | 0.99807 | 2.94 | 0.41 | 8.8 | 8 | 8 |
| 335 | 6.8 | 0.18 | 0.30 | 12.8 | 0.062 | 19.0 | 171.0 | 0.99808 | 3.00 | 0.52 | 9.0 | 7 | 7 |
| 589 | 7.4 | 0.16 | 0.30 | 13.7 | 0.056 | 33.0 | 168.0 | 0.99825 | 2.90 | 0.44 | 8.7 | 7 | 7 |
| 588 | 7.4 | 0.16 | 0.27 | 15.5 | 0.050 | 25.0 | 135.0 | 0.99840 | 2.90 | 0.43 | 8.7 | 7 | 6 |
| 592 | 7.4 | 0.19 | 0.30 | 12.8 | 0.053 | 48.5 | 229.0 | 0.99860 | 3.14 | 0.49 | 9.1 | 7 | 6 |
| 593 | 7.4 | 0.19 | 0.31 | 14.5 | 0.045 | 39.0 | 193.0 | 0.99860 | 3.10 | 0.50 | 9.2 | 6 | 6 |
| 641 | 7.6 | 0.20 | 0.30 | 14.2 | 0.056 | 53.0 | 212.5 | 0.99900 | 3.14 | 0.46 | 8.9 | 8 | 6 |
| 28 | 5.7 | 0.22 | 0.20 | 16.0 | 0.044 | 41.0 | 113.0 | 0.99862 | 3.22 | 0.46 | 8.9 | 6 | 5 |
| 110 | 6.2 | 0.23 | 0.36 | 17.2 | 0.039 | 37.0 | 130.0 | 0.99946 | 3.23 | 0.43 | 8.8 | 6 | 5 |